Modules

Most of the functionality in Python is provided by modules. The Python Standard Library is a large collection of modules that provides cross-platform implementations of common facilities such as access to the operating system, file I/O, string management, network communication, and much more.

References

To use a module in a Python program it first has to be imported. A module can be imported using the import statement.

For example, to import the module math, which contains many standard mathematical functions, we can do:


In [3]:
import math

This imports the whole module and makes it available for use later in the program.

For example, we can do:


In [2]:
import math

x = math.cos(2 * math.pi)

print(x)


1.0

Alternatively, we can choose to import only selected symbols (functions and variables) from a module into the current namespace, so that we don't need to use the prefix "math." every time we use something from the math module:


In [5]:
from math import cos, pi

x = cos(2 * pi)

print(x)


1.0

This is called a selective import.

This pattern can be very convenient, but in large programs that include many modules it is often a good idea to keep the symbols from each module in their own namespaces, by using the import math pattern. This eliminates potentially confusing problems with namespace collisions.
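A minimal sketch of such a collision, assuming the standard library's cmath module as the second module (it is not used elsewhere in this text): both math and cmath define sqrt, and the later selective import silently rebinds the name.

```python
import math
import cmath

from math import sqrt
from cmath import sqrt   # silently replaces math's sqrt in this namespace

# The unqualified name now refers to cmath.sqrt, which accepts negative input.
shadowed = sqrt(-1)
print(shadowed)

# With module prefixes, both functions stay available and unambiguous.
real_root = math.sqrt(4)
complex_root = cmath.sqrt(-1)
print(real_root, complex_root)
```

With the prefixed form, a reader can always tell which module a function comes from, at the cost of a little extra typing.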

In case of namespace collisions (or to avoid namespace pollution), we can use the as keyword to bind an imported name to a different local name:


In [6]:
from math import cos as cosine  # Now the `cos` function can be referenced as `cosine`

In [12]:
cosine(pi/2)


Out[12]:
6.123233995736766e-17

Finally, if we want to import everything from a module, we may use the * character:


In [13]:
from math import *

In [18]:
print("Cosine Function: ", cos(pi))
print("Sin Function: ", sin(pi))
print("Logarithm: ", log(e))
print("Power function: ", pow(3, 3))


Cosine Function:  -1.0
Sin Function:  1.2246467991473532e-16
Logarithm:  1.0
Power function:  27.0
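Note that the star import also replaced the built-in pow with math.pow, which is why the result above is the float 27.0 rather than the integer 27. The sketch below makes this visible; it runs the star import inside a scratch dictionary (via exec) purely so that the effect can be inspected without touching the current namespace.

```python
import builtins
import math

namespace = {}
exec("from math import *", namespace)   # run the star import in a scratch namespace

# Everything public in math has been copied into that namespace.
imported = sorted(name for name in namespace if not name.startswith("__"))
print(len(imported), "names imported")

# The name pow now refers to math.pow and hides the built-in pow.
print(namespace["pow"] is math.pow)
print(namespace["pow"](3, 3), builtins.pow(3, 3))
```

This kind of silent shadowing is the main reason star imports are usually kept to interactive sessions.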

Looking at what a module contains, and its documentation

Once a module is imported, we can list the symbols it provides using the dir function:


In [19]:
import math

print(dir(math))


['__doc__', '__file__', '__loader__', '__name__', '__package__', '__spec__', 'acos', 'acosh', 'asin', 'asinh', 'atan', 'atan2', 'atanh', 'ceil', 'copysign', 'cos', 'cosh', 'degrees', 'e', 'erf', 'erfc', 'exp', 'expm1', 'fabs', 'factorial', 'floor', 'fmod', 'frexp', 'fsum', 'gamma', 'hypot', 'isfinite', 'isinf', 'isnan', 'ldexp', 'lgamma', 'log', 'log10', 'log1p', 'log2', 'modf', 'pi', 'pow', 'radians', 'sin', 'sinh', 'sqrt', 'tan', 'tanh', 'trunc']

Using the help function, we can get a description of each function (almost: not all functions have docstrings, as they are technically called, but the vast majority of functions are documented this way).


In [20]:
help(math.log)


Help on built-in function log in module math:

log(...)
    log(x[, base])
    
    Return the logarithm of x to the given base.
    If the base not specified, returns the natural logarithm (base e) of x.

We can also use the help function directly on modules. Try:

help(math) 
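Docstrings can also be read programmatically. The following is a small sketch using inspect.getdoc from the standard library; the list of undocumented callables is only an illustration, and its content depends on the Python version in use.

```python
import inspect
import math

# The docstring of a single function, returned as a plain string.
doc = inspect.getdoc(math.log)
print(doc)

# Public callables in math that have no docstring at all (often an empty list).
undocumented = [
    name for name in dir(math)
    if not name.startswith("_")
    and callable(getattr(math, name))
    and not inspect.getdoc(getattr(math, name))
]
print(undocumented)
```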

Some very useful modules from the Python standard library are os, sys, math, shutil, re, subprocess, multiprocessing, threading.


How Import Works

Because imports are at the heart of program structure in Python, this section goes into more formal detail on the import operation to make this process less abstract.

This is particularly important because, in Python, the import process is not just a textual insertion of one file into another.

It really consists of a set of runtime operations that perform three distinct steps the first time a program imports a given file:

  1. Find the module file
  2. Compile it to byte code
  3. Run the module's code to build the objects it defines
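The phrase "the first time" matters: later imports of the same module reuse the module object cached in sys.modules instead of repeating the three steps. The sketch below illustrates this with a throwaway module written to a temporary directory; the module name demo_once is hypothetical.

```python
import importlib
import os
import sys
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical module: prints when its code runs and defines one name.
    with open(os.path.join(tmp, "demo_once.py"), "w") as f:
        f.write("print('running demo_once')\ncounter = 0\n")

    sys.path.insert(0, tmp)
    importlib.invalidate_caches()
    try:
        import demo_once                  # first import: find, compile, run
        demo_once.counter += 1
        import demo_once                  # later import: reuses the cached module object
        after_second_import = demo_once.counter
        is_cached = sys.modules["demo_once"] is demo_once

        importlib.reload(demo_once)       # explicitly re-runs the module's code
        after_reload = demo_once.counter
    finally:
        sys.path.remove(tmp)
        sys.modules.pop("demo_once", None)

print(after_second_import, is_cached, after_reload)
```

The second import statement does not reset counter, whereas importlib.reload runs the module's code again and rebinds it to 0.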

1. Find it

First, Python must locate the (module) file the code is trying to import (omitting extension and directory paths).

To do so, Python follows a module search path. In particular, it looks for the module to import in the following locations, in this order:

  1. The Home Directory of the program
  2. PYTHONPATH directories
  3. Standard library directories
  4. The contents of any .pth file (if present)
  5. The site-packages home of third-party extensions (e.g., /usr/lib/python3/site-packages/)

(More on this, later)
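At runtime, the resulting search path is available as the list sys.path. The following sketch prints it and asks the import machinery where two standard library modules would be loaded from; the exact entries and locations depend on the installation.

```python
import importlib.util
import sys

# The directories searched on import, in order.
for entry in sys.path:
    print(repr(entry))

# Where would these modules come from? (No import is performed for top-level names.)
for name in ("json", "math"):
    spec = importlib.util.find_spec(name)
    print(name, "->", spec.origin)
```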

2. Compile it

After finding a source code file that matches an import statement by traversing the module search path, Python next compiles it to byte code, if necessary.

In particular, in this step, the two choices are:

  • Compile:

If the byte code file is older than the source file (i.e., if you’ve changed the source) or was created by a different Python version, Python automatically regenerates the byte code when the program is run.

This model is modified somewhat in Python 3.2+ where byte code files are segregated in a __pycache__ subdirectory and named with their Python version to avoid contention and recompiles when multiple Pythons are installed.

This obviates the need to check version numbers in the byte code, but the timestamp check is still used to detect changes in the source.

  • Don't Compile:

If, on the other hand, Python finds a .pyc byte code file that is not older than the corresponding .py source file and was created by the same Python version, it skips the source-to-byte-code compile step.

In addition, if Python finds only a byte code file on the search path and no source, it simply loads the byte code directly.

This means you can ship a program as just byte code files and avoid sending source.

In other words, the compile step is bypassed if possible to speed up program startup.
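Python performs this step automatically on import, but it can be made visible with the standard library's py_compile module. This is only a sketch on a throwaway file (the name demo_compile.py is hypothetical); the byte code file name depends on the Python version in use.

```python
import importlib.util
import os
import py_compile
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    source_path = os.path.join(tmp, "demo_compile.py")   # hypothetical module file
    with open(source_path, "w") as f:
        f.write("answer = 42\n")

    # Where the import system would cache the byte code for this source file.
    expected_pyc = importlib.util.cache_from_source(source_path)

    # Compile explicitly; the return value is the path of the byte code file written.
    pyc_path = py_compile.compile(source_path, doraise=True)
    pyc_exists = os.path.exists(pyc_path)

    print(os.path.relpath(expected_pyc, tmp))
    print(os.path.relpath(pyc_path, tmp))
    print(pyc_exists)
```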

3. Run it

The final step of an import operation executes the byte code of the module.

All statements in the file are run in turn, from top to bottom, and any assignments made to names during this step generate attributes of the resulting module object.
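The compile and run steps can be imitated by hand to see how assignments become module attributes. This is a simplified sketch, not how the import machinery is actually implemented; the module name demo_run and its contents are hypothetical.

```python
import types

source = """
print("running module body")
x = 1
def double(n):
    return 2 * n
y = double(x)
"""

module = types.ModuleType("demo_run")            # empty module object
code = compile(source, "demo_run.py", "exec")    # step 2: source -> byte code
exec(code, module.__dict__)                      # step 3: run top to bottom

# Every name assigned while the code ran is now an attribute of the module.
print(module.x, module.y, module.double(5))
print([name for name in vars(module) if not name.startswith("__")])
```

Note that def is itself an assignment: it binds the name double in the module's namespace when it is executed.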

Modules & Packages

Modules are probably best understood as simply packages of names - i.e., places to define names you want to make visible to the rest of a system.

Technically, modules usually correspond to files, and Python creates a module object to contain all the names assigned in a module file.

But in simple terms, modules are just namespaces (places where names are created), and the names that live in a module are called its attributes.

In addition to a module name, an import can name a directory path.

A directory of Python code is said to be a package, so such imports are known as package imports.

In effect, a package import turns a directory on your computer into another Python namespace, with attributes corresponding to the subdirectories and module files that the directory contains.
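As an illustration, the sketch below builds a small package in a temporary directory and imports a module from it. The names demo_pkg, sub and mod are hypothetical; the point is that the dotted import path mirrors the directory path.

```python
import importlib
import os
import sys
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical layout: demo_pkg/__init__.py, demo_pkg/sub/__init__.py, demo_pkg/sub/mod.py
    pkg_dir = os.path.join(tmp, "demo_pkg")
    sub_dir = os.path.join(pkg_dir, "sub")
    os.makedirs(sub_dir)
    for init_dir in (pkg_dir, sub_dir):
        open(os.path.join(init_dir, "__init__.py"), "w").close()
    with open(os.path.join(sub_dir, "mod.py"), "w") as f:
        f.write("value = 'from demo_pkg.sub.mod'\n")

    sys.path.insert(0, tmp)
    importlib.invalidate_caches()
    try:
        import demo_pkg.sub.mod            # dotted path mirrors the directory path
        value = demo_pkg.sub.mod.value
        sub_is_attribute = hasattr(demo_pkg, "sub")
    finally:
        sys.path.remove(tmp)
        for name in [n for n in sys.modules if n == "demo_pkg" or n.startswith("demo_pkg.")]:
            del sys.modules[name]

print(value, sub_is_attribute)
```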

Package Relative Imports

The coverage of package imports so far has focused mostly on importing package files from outside the package.

Within the package itself, imports of same-package files can use the same full path syntax as imports from outside the package. However, package files can also make use of special intrapackage search rules to simplify import statements.

That is, rather than listing package import paths, imports within the package can be relative to the package.

The way this works is version-dependent:

Python 2.X implicitly searches package directories first on imports, while Python 3.X requires explicit relative import syntax in order to import from the package directory.

This 3.X change can enhance code readability by making same-package imports more obvious, but it’s also incompatible with 2.X and may break some programs.

Python 3.X changes

For imports in packages, Python 3.X introduces two changes:

  • It modifies the module import search path semantics to skip the package’s own directory by default. Imports check only paths on the sys.path search path. These are known as absolute imports.

  • It extends the syntax of from statements to allow them to explicitly request that imports search the package’s directory only, with leading dots. This is known as relative import syntax.

Examples:

from . import spam

from .spam import name

Why Relative Imports?

Consider the following package directory:

    mypkg\ 
        __init__.py
        main.py 
        string.py

This defines a package named mypkg containing modules named mypkg.main and mypkg.string.

Now, suppose that the main module tries to import a module named string.

In Python 2.X and earlier, Python will first look in the mypkg directory to perform a relative import.

It will find and import the string.py file located there, assigning it to the name string in the mypkg.main module’s namespace. It could be, though, that the intent of this import was to load the Python standard library’s string module instead.

Unfortunately, in these versions of Python, there’s no straightforward way to ignore mypkg.string and look for the standard library’s string module located on the module search path.

On the other hand, in Python 3.X, you could write:

from string import punctuation   # standard library module
from .string import *            # relative import (intrapackage)
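To check this behaviour, the sketch below recreates the mypkg layout in a temporary directory and imports mypkg.main. The file contents and the alias local_string are assumptions made for the demonstration; the alias only serves to keep both modules visible side by side.

```python
import importlib
import os
import sys
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    pkg_dir = os.path.join(tmp, "mypkg")
    os.makedirs(pkg_dir)
    files = {
        "__init__.py": "",
        "string.py": "MARKER = 'mypkg.string'\n",          # hypothetical content
        "main.py": (
            "import string                         # absolute: standard library\n"
            "from . import string as local_string  # relative: mypkg/string.py\n"
        ),
    }
    for filename, text in files.items():
        with open(os.path.join(pkg_dir, filename), "w") as f:
            f.write(text)

    sys.path.insert(0, tmp)
    importlib.invalidate_caches()
    try:
        import mypkg.main
        absolute_name = mypkg.main.string.__name__
        relative_name = mypkg.main.local_string.__name__
        marker = mypkg.main.local_string.MARKER
    finally:
        sys.path.remove(tmp)
        for name in [n for n in sys.modules if n == "mypkg" or n.startswith("mypkg.")]:
            del sys.modules[name]

print(absolute_name, relative_name, marker)
```

Under Python 3.X, the plain import string inside main.py resolves to the standard library module, while the leading dot selects the package's own string.py.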